There are 10 questions to this activity. Save your answers in word document that you will hand in on Blackboard using a .pdf or .docx extension. Save your file with your last name first in the filename to help me keep track of files for grading. Keep your script file in your GitHub folder and make sure that all changes are pushed to GitHub. Code for all questions should be clearly commented with the question number. You will include a link to your script file as a part of your final question in this activity. The assignment is worth 30 points. Include any plots you make for a question in your answers.
As air temperatures rise under global climate change, areas covered in snow and ice are melting. Glaciers form when more snow accumulates then melts or evaporates and the snow becomes compacted over time. Glaciers are defined as being persistent over time (don’t disappear seasonally or even every few years) and must be at least 0.1 km squared in area, and often take thousands of years to form. Glaciers often form in mountainous regions. Glacier National Park is located in Montana and known for sweeping views of glacier covered mountains. The glaciers in Glacier National Park formed at least 7,000 years ago. However, the glaciers in Glacier National Park began noticeably shrinking over the last century. There is concern about impacts on tourism, ecosystems, and hydrology as the glaciers disappear. As shown below, the receding glaciers are noticeable and changing the entire landscape.
Glacial retreat will drastically affect the regional hydrology, vegetation, and the energy balance. For example, a surface with snow or a glacier reflects more incoming solar radiation off of the earth’s surface compared to bare rock and vegetation nearby. The loss of glaciers means that the land surface will experience even greater warming as more solar energy is absorbed. As glaciers retreat and overall snow and ice cover decreases in mountainous regions, vegetation can grow in newly formed bare surfaces. You can see in the picture above, there is grass growing in areas that were once covered in snow and ice. This growth of vegetation can drive additional change on the land surface and alter water, energy, and carbon cycling.
In this exercise, you’ll use data from the National Park Service that shows the footprint of 39 glaciers in 1966 and 2015 to show the change in glacier extent over recent decades.
You will use a series of spatial packages. The rgdal package is good for handling spatial data types. sp has a lot of operations for working with vector data. raster is for working with raster data. Install the following packages:
install.packages(c("sp","rgdal","dplyr"))
Note that if you are using a mac, you may get some package installation errors. Please refer to the posted troubleshooting document to address these errors. These packages rely on a library in your operating system called gdal, but is is not always installed or easy for R to find on some versions of mac os.
#package for vector data
library(sp)
#package for reading in spatial data
library(rgdal)
#data manangement package
library(dplyr)
You will start by reading in the vector data of the glacier outlines. Shapefiles contain vector data and end with the .shp extension. Shapefiles aren’t actually a single file, but an assemblage of multiple files with the same name and a different extension (ex: .prj or .dbf). This helps convey all of the data for the shapefile.
These glaciers have been digitized by the National Park Service. In shapefiles, all metadata is contained in the .xml file. You can open it in a web browser to find more information. The GNP_glaciers file contains all four shapefiles for each year. Read them in using the readOGR function.
#read in shapefiles
#readOGR in rgdal does this
#glaciers in 1966
g1966 <- readOGR("/Users/hkropp/Google Drive/teaching/2020/Fall 2020/EnvDataSci/activity/data/activity 6/GNPglaciers/GNPglaciers_1966.shp")
## OGR data source with driver: ESRI Shapefile
## Source: "/Users/hkropp/Google Drive/teaching/2020/Fall 2020/EnvDataSci/activity/data/activity 6/GNPglaciers/GNPglaciers_1966.shp", layer: "GNPglaciers_1966"
## with 39 features
## It has 13 fields
## Integer64 fields read as strings: OBJECTID
#glaciers in 2015
g2015 <- readOGR("/Users/hkropp/Google Drive/teaching/2020/Fall 2020/EnvDataSci/activity/data/activity 6/GNPglaciers/GNPglaciers_2015.shp")
## OGR data source with driver: ESRI Shapefile
## Source: "/Users/hkropp/Google Drive/teaching/2020/Fall 2020/EnvDataSci/activity/data/activity 6/GNPglaciers/GNPglaciers_2015.shp", layer: "GNPglaciers_2015"
## with 39 features
## It has 21 fields
## Integer64 fields read as strings: OBJECTID Recno
Note there are 39 features (glaciers) for each shapefile and a table with information about each glacier that has 13 columns of data. Let’s investigate what the format of this data.
str(g2015)
There is a lot of information including the coordinates for the polygons and settings for drawing the polygons. Let’s take a look at what one of the glacier data look like. The shapefiles that we read in were automatically classified as spatial polygons. When you tell R to make a plot of the data, it will be recognized as making a map for a spatial dataset. Let’s look at the original glacier footprints from 1966.
#map the glaciers filling in the polygons with light blue and making the borders grey
plot(g1966, col = "lightblue2", border="grey50")
There are a few large glaciers, but many look like small dots on the landscape. It is hard to get more context for these glaciers without knowing the geography of Glacier National Park. We will use some satellite imagery to get a better context shortly.
One of the main features of a shapefile is that a table of data for each spatial feature can be stored along with the spatial features. You can preview the first 6 lines and columns of this data table using the head function.
#data stores all accompanying info/measurements for each spatial object
head(g2015@data)
## OBJECTID Recno Year Src1_0822 Src1_nadir Src3_nadir Src2_nadir Area2015
## 0 1 1055 2015 secondary 11.7240 0.0000 4.7346 736669.75
## 1 2 1202 2015 primary 27.0008 0.0000 0.0000 511589.79
## 2 3 1120 2015 secondary 11.7240 0.0000 4.7346 75562.60
## 3 4 1430 2015 secondary 0.0000 16.8225 0.0000 1498505.92
## 4 5 1041 2015 primary 11.7240 0.0000 26.2766 35298.01
## 5 6 1132 2015 secondary 11.7240 0.0000 4.7346 224773.89
## Shape_leng X_COORD Y_COORD SOURCE
## 0 7741.335 268429.9 5425167 WorldView-01 Satellite imagery from Digital Globe
## 1 5184.650 295750.8 5413701 WorldView-01 Satellite imagery from Digital Globe
## 2 1255.828 268868.3 5421731 WorldView-01 Satellite imagery from Digital Globe
## 3 15290.128 303278.4 5385710 WorldView-01 Satellite imagery from Digital Globe
## 4 2259.001 273823.9 5427266 WorldView-01 Satellite imagery from Digital Globe
## 5 4090.754 276459.4 5419663 WorldView-01 Satellite imagery from Digital Globe
## CLASSIFICA OWNERSHIP SrcOth_nad SrcOth_dat
## 0 main body of glacier Glacier National Park 0.0000 9999/01/01
## 1 main body of glacier Glacier National Park 18.6183 2014/10/19
## 2 main body of glacier Glacier National Park 0.0000 9999/01/01
## 3 main body of glacier Glacier National Park 0.0000 9999/01/01
## 4 main body of glacier Glacier National Park 0.0000 9999/01/01
## 5 main body of glacier Glacier National Park 0.0000 9999/01/01
## COMMENT Src2_0912
## 0 used both 20150822 and 20150912 images for full coverage primary
## 1 20150822 only 2015 image, but high nadir, also used 20141019 image n/a
## 2 20150822 used where shading was heavy in 20150912 image primary
## 3 used color 20150822 as secondary, some offset from 20150925 image n/a
## 4 used image from 20150822 since later image had high off nadir angle secondary
## 5 no comment primary
## Src3_0925 GLACNAME SOURCE_SCA
## 0 n/a Agassiz Glacier 1:12000
## 1 n/a Ahern Glacier 1:12000
## 2 n/a Baby Glacier 1:12000
## 3 primary Blackfoot Glacier 1:12000
## 4 n/a Boulder Glacier 1:12000
## 5 n/a Carter Glacier 1:12000
You will notice here that I refer to components of the spatial dataset using the **@** symbol. This is a specific feature of spatial data in R. This table allows for data describing a spatial object to be stored. You can work with this table just like any other dataframe, but you will be able to display objects in it spatially. You can see that there are names for the glaciers, area, and other information about the location and features of the glaciers.
For a vector object, you can find the projection info in an object called proj4string.
g1966@proj4string
## CRS arguments:
## +proj=utm +zone=12 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m
## +no_defs
Before you get started, there’s one other issue. The glacier names in 2015 don’t match the rest of the shapefiles. Compare 2015 to 1966 for reference:
#check glacier names
g1966@data$GLACNAME
g2015@data$GLACNAME
You’ll notice Miche Wabun is missing Glacier in its name and North Swiftcurrent Glacier should be abbreviated as N.. Fix that so you can use the glacier names to match up the glaciers between both years.
#fix glacier name so that it is consistent with the entire time period
g2015@data$GLACNAME <- ifelse(g2015@data$GLACNAME == "North Swiftcurrent Glacier",
"N. Swiftcurrent Glacier",
ifelse( g2015@data$GLACNAME == "Miche Wabun",
"Miche Wabun Glacier",
as.character(g2015@data$GLACNAME)))
It’s hard to get a reference for the glaciers just from the polygon file. I’ll add a basemap of an image collected by the Landsat satellite on September 5, 2014 for reference in the activity. Note that this is a single snapshot of the land surface on September 5. Don’t forget that glaciers are well defined geologically. You may see snow surrounding a glacier in the image that is not necessarily part of the glacier. You can also see the edge of the city of Kalispell and several small towns in the lower left corner of the image. The spatial resolution of this imagery is 30m x 30m. This means that every pixel covers 30 x 30 meters of land. The images will provide some reference for the location. The landsat sensors measure the light that is reflected up to the top of the atmosphere. If we overlay the red, green, and blue visible bands of light, it will look just like a photograph of the earth surface. We won’t get into working with raster data until next exercise so I will show you an overlay with the glacier polygons so that you are not overwhelmed.
It is a little hard to see the details at the scale of the entire park. Let’s zoom in on the area with a few glaciers:
Next let’s analyze the area of the glaciers between 1966 and 2015. There is an area column in the glacier data tables, and the metadata indicates that the area is in meters squared.
Note: that sometimes re-sizing plots with multiple sources of spatial data can have issues re-sizing all data layers. If it looks like the extent doesn’t match after re-sizing, rerun your code again after re-sizing. It also may take a minute for plots or raster data to render on your screen.
It will be helpful to join all of this data together into a table not associated with the shapefile so that you can analyze it. Here you will use the join from dplyr. Let’s also make a dataframe with only the information we are interested in.
#lets combine area, first work with a smaller data frame
gdf66 <- data.frame(GLACNAME = g1966@data$GLACNAME,
area66 = g1966@data$Area1966)
gdf15 <- data.frame(GLACNAME = g2015@data$GLACNAME,
area15 = g2015@data$Area2015)
#join all data tables by glacier name
gAll <- full_join(gdf66,gdf15, by="GLACNAME")
Below is an image of the different type of joins from the dplyr documentation. Note that you used a full join. Below is an image of the dplyr cheatsheet section on joins. You can also find more documentation on mutating joins in dplyr here:http://lindsaydbrin.github.io/CREATE_R_Workshop/Lesson_-_dplyr_join.html
Finally let’s calculate the percent change in area across the glaciers.
#calculate the % change in area from 1966 to 2015
gAll$gdiff <- ((gAll$area66-gAll$area15)/gAll$area66)*100
Let’s make a map where you color code the % glacial loss between 2015 and 1966. There is a function in sp called spplot that will color code spatial data based on a column of data within the data table of the spatial object. First you will need to add the % change back into the g1966 dataframe. Then you can use the spplot function. This will make a map that fills in your polygons with a color representing the percentage of glacial loss.
#join data with the spatial data table and overwrite into spatial data table
g1966@data <- left_join(g1966@data, gAll, by="GLACNAME")
#use spplot to shade polygons based on the % change of labels
#first argument is the spatial object
#second is the column in of data to display with the different colors
#add a title using main
#col changes the color of the borders. This argument sets them to transparent
spplot(g1966, "gdiff", main="% change in area", col="transparent")
In this last part of the exercise you will work on your interpretation skills and use your R coding knowledge to help you summarize your data. It is often helpful to include some basic summary statistics and describe ranges of data in your data interpretation. There is just one last piece of information that might be helpful. You can subset spatial data a lot like you would a dataframe. This is helpful becuase you can subset it using the logical conditions that you have been using so far. Here’s an example.
#look at the Vulture glacier in 1966
vulture66 <- g1966[g1966@data$GLACNAME == "Vulture Glacier",]
plot(vulture66, main = "Vulture Glacier in 1966", col="slategray")